AITopics

2604.02887

Country:

Europe > France > Occitanie > Haute-Garonne > Toulouse (0.05)
North America > United States > New York > New York County > New York City (0.04)
North America > United States > Massachusetts > Suffolk County > Boston (0.04)
Asia > Singapore (0.04)

Genre: Research Report (0.64)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

Neural Information Processing Systems, Mar-20-2026, 05:02:45 GMT

On the Sparsity of the Strong Lottery Ticket Hypothesis

Considerable research efforts have recently been made to show that a random neural network $N$ contains subnetworks capable of accurately approximating any given neural network that is sufficiently smaller than $N$, without any training. This line of research, known as the Strong Lottery Ticket Hypothesis (SLTH), was originally motivated by the weaker Lottery Ticket Hypothesis, which states that a sufficiently large random neural network $N$ contains sparse subnetworks that can be trained efficiently to achieve performance comparable to that of training the entire network $N$. Despite its original motivation, results on the SLTH have so far not provided any guarantee on the size of subnetworks. Such limitation is due to the nature of the main technical tool leveraged by these results, the Random Subset Sum (RSS) Problem. Informally, the RSS Problem asks how large a random i.i.d.

artificial intelligence, machine learning, proceedings, (9 more...)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.72)

Jeffrey Pennington, Pratik Worah

Nonlinear random matrix theory for deep learning

Neural Information Processing Systems, Nov-21-2025, 04:37:04 GMT

The list of successful applications of deep learning is growing at a staggering rate. Image recognition (Krizhevsky et al., 2012), audio synthesis (Oord et al., 2016), translation (Wu et al., 2016), and speech recognition (Hinton et al., 2012) are just a few of the recent achievements.

artificial intelligence, machine learning, neural network, (15 more...)

Country:

North America > Canada > Ontario > Toronto (0.14)
North America > United States > Rhode Island > Providence County > Providence (0.04)
North America > United States > California > Los Angeles County > Long Beach (0.04)
(2 more...)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Neural Information Processing Systems, May-27-2025, 12:23:14 GMT

RandNet-Parareal: a time-parallel PDE solver using Random Neural Networks

Parallel-in-time (PinT) techniques have been proposed to solve systems of time-dependent differential equations by parallelizing the temporal domain. Among them, Parareal computes the solution sequentially using an inaccurate (fast) solver, and then "corrects" it using an accurate (slow) integrator that runs in parallel across temporal subintervals. This work introduces RandNet-Parareal, a novel method to learn the discrepancy between the coarse and fine solutions using random neural networks (RandNets). RandNet-Parareal achieves speed gains of up to ×125 and ×22 compared to the fine solver run serially and Parareal, respectively. Beyond theoretical guarantees of RandNets as universal approximators, these models are quick to train, allowing the PinT solution of partial differential equations on a spatial mesh of up to $10^5$ points with minimal overhead, dramatically increasing the scalability of existing PinT approaches.
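To make the coarse/fine correction concrete, here is a minimal sketch of the classical Parareal iteration on a scalar test equation $u' = \lambda u$. It is plain Parareal, not RandNet-Parareal: no neural network is involved. The solvers `coarse` and `fine`, the test equation, and all parameter values are illustrative assumptions, not taken from the paper.

```python
def coarse(u, t0, t1, lam=-1.0):
    """Cheap, inaccurate propagator G: a single explicit Euler step."""
    return u + (t1 - t0) * lam * u


def fine(u, t0, t1, lam=-1.0, substeps=100):
    """Expensive, accurate propagator F: many small explicit Euler steps."""
    h = (t1 - t0) / substeps
    for _ in range(substeps):
        u = u + h * lam * u
    return u


def fine_serial(u0, t_grid):
    """Reference solution: the fine solver run sequentially over all subintervals."""
    u = [u0]
    for i in range(len(t_grid) - 1):
        u.append(fine(u[i], t_grid[i], t_grid[i + 1]))
    return u


def parareal(u0, t_grid, iterations):
    """Classical Parareal: U_{i+1}^{k+1} = G(U_i^{k+1}) + F(U_i^k) - G(U_i^k)."""
    n = len(t_grid) - 1
    # Iteration 0: sequential sweep with the coarse solver only.
    u = [u0]
    for i in range(n):
        u.append(coarse(u[i], t_grid[i], t_grid[i + 1]))
    for _ in range(iterations):
        # The fine solves depend only on the previous iterate, so they are
        # independent across subintervals and could run in parallel.
        f_old = [fine(u[i], t_grid[i], t_grid[i + 1]) for i in range(n)]
        g_old = [coarse(u[i], t_grid[i], t_grid[i + 1]) for i in range(n)]
        # Sequential correction sweep with the coarse solver.
        new = [u0]
        for i in range(n):
            g_new = coarse(new[i], t_grid[i], t_grid[i + 1])
            new.append(g_new + f_old[i] - g_old[i])
        u = new
    return u
```

After $k$ iterations the first $k$ subintervals reproduce the serial fine solution, so running as many iterations as there are subintervals recovers it everywhere; the practical gain comes from stopping much earlier.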

equation, randnet-parareal, time-parallel pde solver, (2 more...)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.65)

arXiv.org Machine Learning, May-23-2025

Critical Points of Random Neural Networks

Di Lillo, Simmaco

This work investigates the expected number of critical points of random neural networks with different activation functions as the depth increases in the infinite-width limit. Under suitable regularity conditions, we derive precise asymptotic formulas for the expected number of critical points of fixed index and those exceeding a given threshold. Our analysis reveals three distinct regimes depending on the value of the first derivative of the covariance evaluated at 1: the expected number of critical points may converge, grow polynomially, or grow exponentially with depth. The theoretical predictions are supported by numerical experiments. Moreover, we provide numerical evidence suggesting that, when the regularity condition is not satisfied (e.g. for neural networks with ReLU as activation function), the number of critical points increases as the map resolution increases, indicating a potential divergence in the number of critical points.

artificial intelligence, machine learning, neural network, (16 more...)

2505.17

Country:

North America > United States > New York > New York County > New York City (0.14)
North America > United States > New Jersey > Hudson County > Hoboken (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
Europe > Italy > Lazio > Rome (0.04)

Genre: Research Report (0.64)

Industry: Health & Medicine (0.94)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.48)

Muscarnera, Luca, Loreti, Luigi, Todeschini, Giovanni, Fumagalli, Alessio, Regazzoni, Francesco

Emergence of Structure in Ensembles of Random Neural Networks

arXiv.org Artificial Intelligence, May-16-2025

Randomness is ubiquitous in many applications across data science and machine learning. Remarkably, systems composed of random components often display emergent global behaviors that appear deterministic, manifesting a transition from microscopic disorder to macroscopic organization. In this work, we introduce a theoretical model for studying the emergence of collective behaviors in ensembles of random classifiers. We argue that, if the ensemble is weighted through the Gibbs measure defined by adopting the classification loss as an energy, then there exists a finite temperature parameter for the distribution such that the classification is optimal with respect to the loss (or the energy). Interestingly, for the case in which samples are generated by a Gaussian distribution and labels are constructed by employing a teacher perceptron, we analytically prove and numerically confirm that such optimal temperature depends neither on the teacher classifier (which is, by construction of the learning problem, unknown) nor on the number of random classifiers, highlighting the universal nature of the observed behavior. Experiments on the MNIST dataset underline the relevance of this phenomenon in high-quality, noiseless datasets. Finally, a physical analogy allows us to shed light on the self-organizing nature of the studied phenomenon.
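The Gibbs weighting described in the abstract can be sketched as follows, assuming a 0-1 classification loss as the energy, random perceptrons as the ensemble members, and an inverse temperature `beta` = 1/T. All function names and the choice of loss are illustrative assumptions; the paper's exact setup may differ.

```python
import math
import random


def sign(x):
    return 1 if x >= 0 else -1


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def gaussian_vector(dim, rng):
    """Draw a vector with i.i.d. standard Gaussian entries."""
    return [rng.gauss(0.0, 1.0) for _ in range(dim)]


def zero_one_loss(w, samples, labels):
    """Fraction of samples that the perceptron with weights w misclassifies."""
    errors = sum(sign(dot(w, x)) != y for x, y in zip(samples, labels))
    return errors / len(samples)


def gibbs_weights(losses, beta):
    """Weights proportional to exp(-beta * loss), normalised to sum to 1."""
    lowest = min(losses)  # shift by the minimum for numerical stability
    unnormalised = [math.exp(-beta * (loss - lowest)) for loss in losses]
    total = sum(unnormalised)
    return [u / total for u in unnormalised]


def ensemble_predict(classifiers, weights, x):
    """Sign of the Gibbs-weighted vote of all random perceptrons."""
    return sign(sum(p * sign(dot(w, x)) for w, p in zip(classifiers, weights)))
```

With `beta = 0` (infinite temperature) every classifier gets the same weight, while a large `beta` concentrates the vote on the lowest-loss classifiers; the abstract's claim concerns an intermediate, finite temperature at which the weighted vote is optimal.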

artificial intelligence, classifier, machine learning, (19 more...)


2505.10331

Country: Europe (0.46)

Genre: Research Report > New Finding (0.68)

Industry: Education (0.34)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Perceptrons (0.51)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

Di Lillo, Simmaco, Marinucci, Domenico, Salvi, Michele, Vigogna, Stefano

Fractal and Regular Geometry of Deep Neural Networks

arXiv.org Machine Learning, Apr-8-2025

We study the geometric properties of random neural networks by investigating the boundary volumes of their excursion sets for different activation functions, as the depth increases. More specifically, we show that, for activations which are not very regular (e.g., the Heaviside step function), the boundary volumes exhibit fractal behavior, with their Hausdorff dimension monotonically increasing with the depth. On the other hand, for activations which are more regular (e.g., ReLU, logistic and $\tanh$), as the depth increases, the expected boundary volumes can either converge to zero, remain constant or diverge exponentially, depending on a single spectral parameter which can be easily computed. Our theoretical results are confirmed in some numerical experiments based on Monte Carlo simulations.

artificial intelligence, machine learning, neural network, (19 more...)

2504.0625

Country:

North America > United States > New York (0.04)
North America > United States > New Jersey > Hudson County > Hoboken (0.04)
North America > United States > Rhode Island > Providence County > Providence (0.04)
(3 more...)

Genre: Research Report > New Finding (0.45)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.64)

Neufeld, Ariel, Schmocker, Philipp

Solving stochastic partial differential equations using neural networks in the Wiener chaos expansion

arXiv.org Machine Learning, Nov-5-2024

In this paper, we solve stochastic partial differential equations (SPDEs) numerically by using (possibly random) neural networks in the truncated Wiener chaos expansion of their corresponding solution. Moreover, we provide some approximation rates for learning the solution of SPDEs with additive and/or multiplicative noise. Finally, we apply our results in numerical examples to approximate the solution of three SPDEs: the stochastic heat equation, the Heath-Jarrow-Morton equation, and the Zakai equation.

equation, neural network, random neural network, (15 more...)

2411.03384

Country:

North America > United States > New York > New York County > New York City (0.14)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
Europe > Germany > Baden-Württemberg > Freiburg (0.04)
(9 more...)

Genre: Research Report > New Finding (0.34)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.67)

Jeffrey Pennington, Pratik Worah

Nonlinear random matrix theory for deep learning

Neural Information Processing Systems, Oct-2-2024, 18:16:33 GMT

Neural network configurations with random weights play an important role in the analysis of deep learning. They define the initial loss landscape and are closely related to kernel and random feature methods. Despite the fact that these networks are built out of random matrices, the vast and powerful machinery of random matrix theory has so far found limited success in studying them. A main obstacle in this direction is that neural networks are nonlinear, which prevents the straightforward utilization of many of the existing mathematical results. In this work, we open the door for direct applications of random matrix theory to deep learning by demonstrating that the pointwise nonlinearities typically applied in neural networks can be incorporated into a standard method of proof in random matrix theory known as the moments method.

activation function, matrix, neural network, (13 more...)

Country:

North America > Canada > Ontario > Toronto (0.14)
North America > United States > Rhode Island > Providence County > Providence (0.04)
North America > United States > California > Los Angeles County > Long Beach (0.04)
(2 more...)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

arXiv.org Machine Learning, May-16-2024

Random ReLU Neural Networks as Non-Gaussian Processes

Parhi, Rahul, Bohra, Pakshal, Biari, Ayoub El, Pourya, Mehrsa, Unser, Michael

We consider a large class of shallow neural networks with randomly initialized parameters and rectified linear unit activation functions. We prove that these random neural networks are well-defined non-Gaussian processes. As a by-product, we demonstrate that these networks are solutions to stochastic differential equations driven by impulsive white noise (combinations of random Dirac measures). These processes are parameterized by the law of the weights and biases as well as the density of activation thresholds in each bounded region of the input domain. We prove that these processes are isotropic and wide-sense self-similar with Hurst exponent $3/2$. We also derive a remarkably simple closed-form expression for their autocovariance function. Our results are fundamentally different from prior work in that we consider a non-asymptotic viewpoint: The number of neurons in each bounded region of the input domain (i.e., the width) is itself a random variable with a Poisson law with mean proportional to the density parameter. Finally, we show that, under suitable hypotheses, as the expected width tends to infinity, these processes can converge in law not only to Gaussian processes, but also to non-Gaussian processes depending on the law of the weights. Our asymptotic results provide a new take on several classical results (wide networks converge to Gaussian processes) as well as some new ones (wide networks can converge to non-Gaussian processes).
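As a rough illustration of the non-asymptotic viewpoint, the sketch below samples a one-dimensional shallow ReLU network whose width on the interval $[-R, R]$ is Poisson distributed with mean proportional to a density parameter. The restriction to one input dimension, the uniform thresholds, the Gaussian output weights, and all names are simplifying assumptions made here, not the paper's construction.

```python
import math
import random


def sample_poisson(mean, rng):
    """Poisson sample via Knuth's multiplication method (fine for small means)."""
    limit = math.exp(-mean)
    count, product = 0, rng.random()
    while product > limit:
        count += 1
        product *= rng.random()
    return count


def sample_random_relu_network(density, radius, rng):
    """Draw a network whose width on [-radius, radius] is Poisson(density * 2 * radius)."""
    width = sample_poisson(density * 2 * radius, rng)
    # Activation thresholds placed uniformly in the region (assumption).
    thresholds = [rng.uniform(-radius, radius) for _ in range(width)]
    # Output weights drawn from a standard Gaussian law (assumption).
    weights = [rng.gauss(0.0, 1.0) for _ in range(width)]
    return thresholds, weights


def evaluate(network, x):
    """f(x) = sum_k v_k * ReLU(x - tau_k)."""
    thresholds, weights = network
    return sum(v * max(0.0, x - tau) for tau, v in zip(thresholds, weights))
```

Swapping the Gaussian draw for a heavier-tailed law changes the law of the weights, which is the ingredient the abstract says determines whether the wide limit is Gaussian or non-Gaussian.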

neural network, relu, stochastic process, (13 more...)

2405.10229

Country:

North America > United States > New York (0.04)
Europe > Switzerland > Vaud > Lausanne (0.04)
North America > United States > Massachusetts > Suffolk County > Boston (0.04)
(4 more...)

Genre: Research Report (0.70)

Technology:

Information Technology > Modeling & Simulation (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)